A Comparison of Generated Wikipedia Profiles Using Social Labeling and Automatic Keyword Extraction
Abstract
In many collaborative systems, researchers are interested in creating representative user profiles. In this paper, we focus on two techniques for generating such profiles: social labeling and automatic keyword extraction. Social labeling is a process in which users manually tag other users with keywords. Automatic keyword extraction selects the most salient words to represent a user’s contributions. We apply both profile generation methods to highly active Wikipedia editors and their contributions, and compare the results. We found that profiles generated through social labeling match the profiles generated via automatic keyword extraction, and vice versa. The results suggest that user profiles generated by one method can be used as a seed or bootstrapping proxy for the other method.
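The abstract does not say how keywords were extracted or how the two kinds of profile were compared, so the following is only a minimal illustrative sketch. It assumes TF-IDF scoring for keyword extraction and Jaccard overlap for the comparison; the function names, the toy edit texts, and the tag set are all hypothetical and do not come from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_keywords(user_text, all_texts, k=3):
    """Return the k highest-scoring TF-IDF terms for one user's text.

    Assumption: TF-IDF stands in for whatever extraction method the
    paper actually used.
    """
    n_docs = len(all_texts)
    # Document frequency: in how many users' texts each term appears.
    df = Counter()
    for text in all_texts:
        df.update(set(tokenize(text)))
    tf = Counter(tokenize(user_text))
    scores = {
        term: count * math.log(n_docs / df[term])
        for term, count in tf.items()
        if df[term] > 0
    }
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores, key=lambda term: (-scores[term], term))
    return ranked[:k]


def jaccard(a, b):
    """Jaccard overlap between two keyword collections (0.0 if both empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical toy data: one text blob per editor, plus tags that other
# users might have assigned to "alice" through social labeling.
edits = {
    "alice": "history of rome roman empire rome senate",
    "bob": "python programming language python compiler",
    "carol": "history of jazz music jazz saxophone",
}
alice_tags = {"rome", "history", "empire"}

alice_keywords = tfidf_keywords(edits["alice"], list(edits.values()), k=3)
overlap = jaccard(alice_keywords, alice_tags)
```

A higher `overlap` would indicate closer agreement between the extracted profile and the socially labeled one; how much agreement counts as a "match" is a design choice that this sketch leaves open.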
Similar resources
A Knowledge-Base Oriented Approach for Automatic Keyword Extraction
Automatic keyword extraction is an important subfield of the information extraction process. It is a difficult task, for which numerous different techniques and resources have been proposed. In this paper, we propose a generic approach to extract keywords from documents using encyclopedic knowledge. Our two-step approach first relies on a classification step for identifying candidate keywords followed b...
Advertising Keyword Suggestion Using Relevance-Based Language Models from Wikipedia Rich Articles
When emerging technologies such as Search Engine Marketing (SEM) face tasks that require human level intelligence, it is inevitable to use the knowledge repositories to endow the machine with the breadth of knowledge available to humans. Keyword suggestion for search engine advertising is an important problem for sponsored search and SEM that requires a goldmine repository of knowledge. A recen...
Automatic keyword extraction using Latent Dirichlet Allocation topic modeling: Similarity with golden standard and users' evaluation
Purpose: This study investigates the automatic keyword extraction from the table of contents of Persian e-books in the field of science using LDA topic modeling, evaluating their similarity with golden standard, and users' viewpoints of the model keywords. Methodology: This is a mixed text-mining research in which LDA topic modeling is used to extract keywords from the table of contents of sci...
Introduction to the RKEA Package
A short introduction to the RKEA package. The RKEA package provides an R interface to Kea (http://www.nzdl.org/Kea/), a tool for keyword extraction in texts. See https://code.google.com/p/kea-algorithm/ and http://www.nzdl.org/Kea/Download/Kea-5.0-Readme.txt for further information on Kea. Note that Maui (http://maui-indexer.googlecode.com/), an algorithm for topic indexing, can be...
Automatic Content-Based Categorization of Wikipedia Articles
Wikipedia’s article contents and its category hierarchy are widely used to produce semantic resources which improve performance on tasks like text classification and keyword extraction. The reverse – using text classification methods for predicting the categories of Wikipedia articles – has attracted less attention so far. We propose to “return the favor” and use text classifiers to improve Wik...
Publication date: 2010